Learning Representations for Multimodal Data with Deep Belief Nets

نویسنده

  • Nitish Srivastava
چکیده

We propose a Deep Belief Network architecture for learning a joint representation of multimodal data. The model defines a probability distribution over the space of multimodal inputs and allows sampling from the conditional distributions over each data modality. This makes it possible for the model to create a multimodal representation even when some data modalities are missing. Our experimental results on bi-modal data consisting of images and text show that the Multimodal DBN can learn a good generative model of the joint space of image and text inputs that is useful for filling in missing data so it can be used both for image annotation and image retrieval. We further demonstrate that using the representation discovered by the Multimodal DBN our model can significantly outperform SVMs and LDA on discriminative tasks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multimodal Learning with Deep Boltzmann Machines

A Deep Boltzmann Machine is described for learning a generative model of data that consists of multiple and diverse input modalities. The model can be used to extract a unified representation that fuses modalities together. We find that this representation is useful for classification and information retrieval tasks. The model works by learning a probability density over the space of multimodal...

متن کامل

Extensive Deep Belief Nets with Restricted Boltzmann Machine Using MapReduce Framework

Big data is a collection of data sets which is used to describe the exponential growth and availability of both ordered and amorphous data. It is difficult to process big data using traditional data processing applications. In many practical problems, deep learning is one of the machine learning algorithms that has received great popularity in both academia and industry due to its high-level ab...

متن کامل

Deep Belief Nets for Topic Modeling Workshop on Knowledge-Powered Deep Learning for Text Mining (KPDLTM-2014)

Applying traditional collaborative filtering to digital publishing is challenging because user data is very sparse due to the high volume of documents relative to the number of users. Content based approaches, on the other hand, is attractive because textual content is often very informative. In this paper we describe large-scale content based collaborative filtering for digital publishing. To ...

متن کامل

Multimodal DBN for Predicting High-Quality Answers in cQA portals

In this paper, we address the problem for predicting cQA answer quality as a classification task. We propose a multimodal deep belief nets based approach that operates in two stages: First, the joint representation is learned by taking both textual and non-textual features into a deep learning network. Then, the joint representation learned by the network is used as input features for a linear ...

متن کامل

Deep Dynamic Neural Networks for Gesture Segmentation and Recognition

The purpose of this paper is to describe a novel method called Deep Dynamic Neural Networks(DDNN) for the Track 3 of the Chalearn Looking at People 2014 challenge [1]. A generalised semi-supervised hierarchical dynamic framework is proposed for simultaneous gesture segmentation and recognition taking both skeleton and depth images as input modules. First, Deep Belief Networks(DBN) and 3D Convol...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012